The invention relates to methods for assaying exogenous protease activity in a host cell transformed with nucleotide sequences 
encoding that protease and a specialized substrate. It also relates to methods for assaying endogenous protease activity in a host cell 
transformed with nucleotide sequences encoding a specialized substrate. When these nucleotide sequences are expressed, the exogenous 
or endogenous protease cleaves the substrate and releases a polypeptide that is secreted out of the cell where it can be easily quantitated 
using standard assays. The methods and transformed host cells of this invention are particularly useful for identifying inhibitors of the 
exogenous and endogenous proteases. If the protease is a protease from an infectious agent, inhibitors identified by these methods are 
potential pharmaceutical agents for the treatment or prevention of infection by that agent.

METHODS, NUCLEOTIDE SEQUENCES AND HOST CELLS FOR 
ASSAYING EXOGENOUS AND ENDOGENOUS PROTEASE ACTIVITY 

TECHNICAL FIELD OF INVENTION 
The invention relates to methods for assaying
exogenous protease activity in a host cell transformed
with nucleotide sequences encoding that protease and a
specialized substrate. It also relates to methods for
assaying endogenous protease activity in a host cell
transformed with nucleotide sequences encoding a
specialized substrate. When these nucleotide sequences
are expressed, the exogenous or endogenous protease
cleaves the substrate and releases a polypeptide that is
secreted out of the cell, where it can be easily
quantitated using standard assays. The methods and
transformed host cells of this invention are particularly
useful for identifying inhibitors of the exogenous and
endogenous proteases. If the protease is a protease from
an infectious agent or is characteristic of a diseased
state, inhibitors identified by these methods are
potential pharmaceutical agents for treatment or
prevention of the disease.

BACKGROUND ART 

Proteases play an important role in the
regulation of many biological processes. They also play
a major role in disease. In particular, proteolysis of
primary polypeptide precursors is essential to the
replication of several infectious viruses, including HIV
and HCV. These viruses encode proteins that are
initially synthesized as large polyprotein precursors.
Those precursors are ultimately processed by the viral
protease to mature viral proteins. In light of this,
researchers have begun to concentrate on inhibition of
viral proteases as a potential treatment for certain
viral diseases.

Proteases also play a role in non-infectious 
diseases. For example, changes in normal cellular
function may cause an undesirable increase or decrease in 
proteolytic activity. This often leads to a disease 
state. 

The ability to detect viral or mutant protease
activity in a quick and simple assay is important in the
biochemical characterization of these proteases and in 
the screening and identification of potential inhibitors. 
Several of these assays have been described in the art. 

T. M. Block et al., Antimicrob. Agents
Chemother., 34, pp. 2337-41 (1990) described a prototype
assay for screening potential HIV protease inhibitors.
This assay involved cloning the HIV protease recognition
sequence into the tetracycline resistance gene (TetR) of
pBR322 and cotransforming E. coli with the modified TetR
gene and the gene encoding the HIV protease.
Co-expression of these two genes caused tetracycline
sensitivity. Potential inhibitors were identified by the
ability to restore tetracycline resistance to the
transformed bacteria.

E. Sarubbi et al., FEBS Lett., 279, pp. 265-69
(1991) described another assay for detecting HIV protease
inhibitors that utilized an HIV-1 Gag-β-galactosidase
fusion protein and a monoclonal antibody that bound to
the fusion protein in the gag region. Coexpression of
the HIV protease and the fusion protein led to cleavage
of the latter and abolished monoclonal antibody binding.
Potential inhibitors were identified by increased binding
of the monoclonal antibody to the fusion protein.

T. A. Smith et al., Proc. Natl. Acad. Sci. USA,
88, pp. 5159-62 (1991), B. Dasmahapatra et al., Proc.
Natl. Acad. Sci. USA, 89, pp. 4159-62 (1992) and M. G.
Murray et al., Gene, 134, pp. 123-28 (1993) each
described protease assay systems utilizing the yeast GAL4
protein. Each of these authors described inserting a
protease cleavage site between the DNA binding domain
and the transcriptional activating domain of GAL4.
Cleavage of that site by a coexpressed protease renders
GAL4 transcriptionally inactive, leaving the transformed
yeast unable to metabolize galactose.

H.-D. Liebig et al., Proc. Natl. Acad. Sci.
USA, 88, pp. 5979-83 (1991) disclosed the use of a fusion
protein consisting of a self-cleaving protease fused to
the α fragment of β-galactosidase to assay protease
activity. Active forms of the protease cleaved
themselves off of the fusion protein and the resulting
protein was able to carry out α-complementation. Fusions
containing inactive protease were unable to perform
α-complementation.

Y. Komoda et al., J. Virol., 68, pp. 7351-57
(1994) described an assay to identify HCV protease
cleavage sites within the HCV precursor polyprotein.
These authors created chimeric proteins comprising
various portions of the HCV precursor polyprotein
inserted between the E. coli maltose binding protein
and dihydrofolate reductase. If the HCV portion of these
chimeras contained a cleavage site, the chimera would be
cleaved when it was coexpressed with HCV protease in E.
coli. Cleavage of the chimera was determined by
SDS-polyacrylamide gel electrophoresis of E. coli lysates.

Y. Hirowatari et al., Anal. Biochem., 225, pp.
113-120 (1995) described another assay to detect HCV
protease activity. In this assay, the substrate, HCV
protease and a reporter gene are cotransfected into COS
cells. The substrate is a fusion protein consisting of
(HCV NS2)-(DHFR)-(HCV NS3 cleavage site)-Tax1. The
reporter gene is chloramphenicol acetyltransferase (CAT)
under control of the HTLV-1 long terminal repeat (LTR) and
resides in the cell nucleus following expression. The
uncleaved substrate is expressed as a membrane-bound
protein on the surface of the endoplasmic reticulum due
to the HCV NS2 portion. Upon cleavage, the released Tax1
protein translocates to the nucleus and activates CAT
expression by binding to the HTLV-1 LTR. Protease
activity is determined by measuring CAT activity in a
cell lysate.

Despite these developments, no one has yet
developed a protease assay system that can be carried out
with higher eukaryotic cells, is quantitative, and
does not require cell lysis prior to quantitation.
Avoiding cell lysis prior to quantitation is desirable in
that the assay may be performed more rapidly and with
less manipulation. Also, lysis can often lead to
aberrant results. Thus, there is a need for an accurate
and quantitative cellular-based protease assay that can
be carried out in a higher eukaryotic cell without cell
lysis.

SUMMARY OF THE INVENTION

The present invention fulfills this need by
providing methods for assaying exogenous protease
activity in a host cell expressing that protease. The
methods involve utilizing a host cell expressing a first
nucleotide sequence encoding an exogenous protease and a
second nucleotide sequence encoding an artificial
substrate for that protease. The artificial substrate
comprises a cleavage site for the protease situated at or
near the natural maturation site of a pre-polypeptide,
part of which is secreted following proteolytic
processing. When the host is grown under conditions that
cause expression of the first and second nucleotide
sequences, the exogenous protease cuts the artificial
substrate at the cleavage site, releasing the mature
polypeptide, which is secreted into the growth media. The
growth media is then isolated and assayed for the mature
polypeptide.

Alternatively, the invention may be utilized to 
assay endogenous proteases, especially when quantitation 
of those proteases is difficult due to the inability to
detect or distinguish between the cleaved and uncleaved
native substrate. 

According to one aspect of the invention, the
assay is used to quantitate an exogenous viral protease.
Such assays are particularly useful as replacements for
current viral protease assays that require the use of
intact, infectious virus or where no simple viral model
is available to detect viral protease activity. These
assays may be used to identify and assay potential
inhibitors of viral proteases which, in turn, may be used
as pharmaceutical agents for the treatment or prevention
of viral disease.

This invention also provides host cells
transformed with nucleotide sequences encoding an
exogenous protease and a corresponding substrate, as
well as those transformed with a specialized substrate
for an endogenous protease. These hosts may be used in
the methods of this invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structure of pcDL-SRα296.

Figure 2 depicts the structure of a derivative
of pKV containing the pre-IL-1β coding sequence.

Figure 3, panel A, is an immunoblot of cell
lysates from cells transfected with a NS3-wild-type or
NS3-mutant NS3-4A-4B-IL-1β or cotransfected with a
NS3-mutant NS3-4A-4B-IL-1β and a NS3 (1-180) construct
probed with an anti-NS3 antibody. Figure 3, panel B, is an
immunoblot of the same cell lysates probed with an
anti-IL-1β antibody.

Figure 4 depicts the immunoprecipitation of the
media from 35S-labelled cells transfected with either a
NS3-wild-type or NS3-mutant NS3-4A-4B-IL-1β construct with
an anti-IL-1β antibody.

Figure 5 is an immunoblot of cell lysates from
cells co-transfected with NS3-4A and either a NS5A/5B- or
CSM-containing pre-IL-1β substrate probed with an
anti-IL-1β antibody.

Figure 6 depicts the immunoprecipitation of the
media from 35S-labelled cells co-transfected with NS3-4A
and either a NS5A/5B- or CSM-containing pre-IL-1β
substrate with an anti-IL-1β antibody.

Figure 7 depicts the inhibition of HCV NS3
protease cleavage of pre-IL-1β* by varying concentrations
of VH16075 and VH15924.






DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides a method for 
assaying exogenous protease activity in a host cell 
comprising the steps of:

(a) incubating a host cell transformed 
with a first nucleotide sequence encoding an exogenous 
protease and a second nucleotide sequence encoding an 
artificial polypeptide substrate under conditions which 
cause said exogenous protease and said artificial
substrate to be expressed; 

wherein said substrate comprises: 

(i) a cleavage site for said 
exogenous protease; and 

(ii) a polypeptide that is secreted 
out of said cell following cleavage by said 
exogenous protease; 

(b) separating said host cell from its 
growth media under non-lytic conditions; and 

(c) assaying said growth media for the
presence of said secreted polypeptide.

As used herein, the term "exogenous protease" 
means a protease not normally expressed by the host cell 
used in the assay. That term includes full-length 
proteases that are identical to those found in nature, as 
well as catalytically active fragments thereof. 

The choice of exogenous protease to be assayed 
is solely dependent upon the decision of the user. The 
only requirements are that: (1) the specificity of the 
enzyme in terms of what amino acid residues or sequences
it cleaves at be known; (2) the primary structure of at
least the catalytically active portion of the enzyme be
known; and (3) a nucleotide sequence encoding at least an
enzymatically active portion of the protease exists or
can be made and can be expressed in a heterologous host
cell.

According to a preferred embodiment, the 
exogenous protease is a protease encoded by a pathogenic 
agent. More preferred is a protease encoded by a 
pathogenic virus. Most preferably, the exogenous 
protease is the NS3 protease of hepatitis C virus
("HCV"). 

HCV NS3 protease is a 70 kilodalton protein 
that is involved in the maturation of viral polypeptides 
following infection. It is a serine protease which has a
Cys-X or Thr-X substrate specificity. It has also been
shown that the protease activity of NS3 resides
exclusively in the N-terminal 180 amino acids of the
enzyme. Therefore, nucleotide sequences encoding
anywhere from the first 180 amino acids of NS3 up to the
full length enzyme may be utilized in the methods of this
invention. Active fragments of other known proteases may 
also be used as an alternative to the full-length 
protease. 

According to an alternative embodiment, the 
invention provides a method for assaying endogenous
protease activity in a host cell comprising the steps of:

a) incubating a host cell transformed with a 
nucleotide sequence encoding an artificial polypeptide 
substrate under conditions which cause said artificial
substrate to be expressed;

wherein said substrate comprises: 

i) a cleavage site for said endogenous 
protease; and 

ii) a polypeptide that is secreted out of
said cell following cleavage by said endogenous
protease;

b) separating said host cell from its growth
media under non-lytic conditions; and

c) assaying said growth media for the 
presence of said secreted polypeptide. 

The term "endogenous protease", as used 
throughout this application, refers to a protease that
is normally expressed by the host cell. It includes both 
wild type proteases, as well as naturally occurring 
mutant proteases with increased or decreased activity. 

According to the invention, the artificial
polypeptide substrate used in the methods must comprise a
cleavage site for the protease to be assayed, and must be
secreted out of the cell following cleavage by that
protease. Preferably, the DNA encoding the artificial
substrate is derived from a gene or cDNA encoding a
naturally occurring polypeptide that is normally cleaved
and then secreted out of a cell, but not necessarily 
cleaved by the cell utilized in the assay. 

The DNA encoding that polypeptide is then 
modified by inserting, in frame with the polypeptide 
coding sequence, nucleotides encoding a cleavage site
that is recognized by the exogenous protease to be 
tested. If the cell utilized in the assay is capable of 
cleaving the substrate at its native cleavage site, then 
the nucleotides encoding the polypeptide's native 
cleavage site must be altered so as to render it
uncleavable by endogenous proteases. 

The protease cleavage site in the artificial 
substrate is preferably inserted within 60 amino acids on
either side of the native cleavage site. Preferably, the
artificial cleavage site is inserted N-terminal to the
native cleavage site. Alternatively, the protease 
cleavage site can be created by mutating the native 
polypeptide sequence. Such mutation is preferably 
performed on a sequence within 60 amino acids, more 
preferably N-terminal to the native cleavage site and
within 8-10 amino acids of the native cleavage site; or 
is a mutation of the native cleavage site itself. 

Alteration of the native cleavage site to
render it uncleavable by the host cell may be achieved,
if necessary, by insertion, deletion or mutation of
nucleotides at that site.

Insertion of the protease cleavage site into
the substrate and alteration of its native cleavage site
may be accomplished by any combination of a number of
recombinant DNA techniques well known in the art, such as
site directed mutagenesis or standard restriction
digest/ligation cloning techniques. Alternatively, the
DNA encoding all or part of the artificial substrate may
be produced synthetically using a commercially available
automated oligonucleotide synthesizer. Regardless of the
techniques used to insert the protease cleavage site into
the substrate polypeptide or alter its native cleavage
site, it is crucial that the reading frame of the
substrate polypeptide remain intact, without the
insertion of stop codons.
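
By way of illustration only, the reading-frame requirement
stated above can be checked computationally before a construct
is made. The following Python sketch is not part of the
disclosed methods. The function name insert_preserves_frame is
hypothetical, and the sketch assumes that the insert is placed
at a codon boundary of the substrate coding sequence and is read
in the same orientation.

```python
# Illustrative sketch (assumption: the insert is placed at a codon boundary).
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def insert_preserves_frame(insert_dna: str) -> bool:
    """Return True if the insert keeps the downstream frame intact.

    Two conditions are checked:
      1. the insert length is a multiple of three, so that downstream
         codons are not shifted;
      2. no in-frame codon of the insert is a stop codon.
    """
    seq = insert_dna.upper()
    if len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    # TGC-AGC-ATG encodes Cys-Ser-Met
    print(insert_preserves_frame("TGCAGCATG"))
```

A check of this kind covers only the insert itself. An insert
cloned between two restriction sites would also need the
junction codons on either side to be examined.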

The secretable polypeptide from which
the artificial substrate is derived may be selected from
any pre-polypeptide that can be cleaved by the host
cell used for the assay, with the resulting mature
polypeptide secreted out of that cell, but that is not
normally present in that cell. For use in eukaryotic
cells there are two main categories of pre-polypeptide
from which the choice can be made.

The first and preferred category comprises
pre-polypeptides that are expressed and cleaved in the
cytoplasmic compartment. Among these proteins are
interleukin-1β (IL-1β), interleukin-1α (IL-1α), basic
fibroblast growth factor (bFGF) and endothelial-monocyte
activating polypeptide II (EMAP-II). The advantage of
using cytoplasmic pre-polypeptides is that there is a
much greater likelihood that the protease and the
artificial substrate will share the same subcellular
compartment. This is because most proteases of interest
are also cytoplasmic proteins and thus will have access
to the artificial substrate.

The second category of pre-polypeptides that
may be used to create artificial substrates used in the
methods of this invention are those that are expressed on
the cell surface through the organellar secretory pathway
and are retained on the cell surface. Such substrates
are useful to assay endogenous and exogenous cell
membrane proteases, as well as exogenous proteases that
are similarly engineered to be cell membrane proteins.
The technique of creating a cell membrane protease or
substrate involves cloning a leader peptide (i.e., signal
sequence) onto the N-terminus of the substrate or
protease and a hydrophobic membrane anchor sequence
(either a transmembrane domain or a
glycosylphosphatidylinositol anchor sequence) onto the
C-terminus. The resulting substrate is a cell membrane
protein with an extracellularly located cleavage site.
When cleaved by a cell membrane protease on the same or a
neighboring cell, the secreted polypeptide portion of the
substrate is released into the media.

Examples of sequences that may be used for
anchoring these proteins in the membrane are the
transmembrane domains of TNFα precursor [Nedopsasov et
al., Cold Spring Harb. Symp. Quant. Biol., 51, pp. 611-24
(1986)], SP-C precursor [Keller et al., Biochem J., 277,
pp. 493-99 (1991)], or alkaline phosphatase [Berger et
al., Proc. Natl. Acad. Sci. USA, 86, pp. 1457-60 (1989)].
Techniques for cloning a signal sequence onto a
cytoplasmic protein have been well documented [see, for
example, Kizer and Trosha, BBRC, 174, pp. 586-92 (1991);
Jost et al., J. Biol. Chem., 269, pp. 26267-72 (1994)
(expression and secretion of functional single chain Fv
molecules using immunoglobulin light chain leader
sequence); and Sasada et al., Cell Structure Function,
13, pp. 129-41 (1988) (secretion of human EGF and IgE in
mammalian cells using an IL-2 leader sequence)], as have
techniques for cloning transmembrane anchor sequences
onto cytoplasmic proteins [Berger et al., supra; Oda et
al., Biochem J., 301, pp. 577-83 (1984)]. By combining
these two techniques, the protease or substrate of
interest can be converted from a cytoplasmic protein into
a cell surface membrane protein.

In order to ensure that the substrate and
protease will have access to one another, according to
an alternate embodiment of the invention the artificial
substrate and an exogenous protease to be assayed may be
encoded as part of a single polyprotein. That
polyprotein may be a cytoplasmic or a membrane protein, 
as long as the substrate and protease domains reside in 
the same cellular compartment. 

The choice of host cell to use in this method
is virtually unlimited. Any cell that can grow in
culture, be transformed or transfected with heterologous
nucleotide sequences and express those sequences may
be employed in this method. These include bacteria, such
as E. coli and Bacillus; yeast and other fungi; plant
cells; insect cells; and mammalian cells. In addition,
expression of either of those sequences in higher
eukaryotic host cells may be transient or stable.
Preferably, the host cell is a higher eukaryotic cell
that is incapable of cleaving the substrate at its native
cleavage site. More preferably, the host cell is a
mammalian cell. Most preferably, the host cell is a COS
cell.

It will be apparent that the specific choice of
cell is governed by the particular protease to be assayed
and by the particular artificial substrate used. In
embodiments that assay an exogenous protease, one obvious
limitation is that the endogenous cellular enzymes of the
chosen host must be unable to cleave the artificial
substrate to any significant extent. The endogenous rate
of artificial substrate cleavage may be determined by
transforming the selected host cell with only the
nucleotide sequence coding for the artificial substrate
and then growing that host under conditions which cause
expression of that nucleotide sequence and which would
cause expression of the exogenous protease-encoding
nucleotide sequence if that sequence were present. The
growth media of the cell is then assayed for the presence
of the secreted polypeptide portion of the substrate. In
assays that measure exogenous protease activity, control
cells (no exogenous protease expressed) should secrete
less than 10% of the total amount of expressed substrate
(due to endogenous cleavage and, in assays that do not
distinguish between cleaved and uncleaved substrates,
leaching of uncleaved substrate out of the cell) in order
to be useful in the methods of this invention. When an
endogenous protease is assayed, a control for
non-specific substrate cleavage is a cell transformed with
a substrate that contains a mutation at the cleavage site.
This mutation renders the substrate uncleavable by the
specific endogenous protease being assayed, but still
susceptible to non-specific cleavage. As with assays for
exogenous proteases, control cells should secrete less
than 10% of the total amount of expressed substrate.
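
The 10% criterion for control cells can be expressed as a
simple calculation. The Python sketch below is offered only as
an illustration. The function names are hypothetical, and it
assumes that the secreted amount and the total expressed amount
have been measured in the same units (for example, ng per well).

```python
# Illustrative sketch; names and units are assumptions, not part of the assay.

def secreted_fraction(secreted: float, total_expressed: float) -> float:
    """Fraction of the expressed substrate found in the growth media."""
    if total_expressed <= 0:
        raise ValueError("total_expressed must be positive")
    return secreted / total_expressed


def control_is_acceptable(secreted: float, total_expressed: float,
                          limit: float = 0.10) -> bool:
    """True if control cells secrete less than `limit` of the substrate.

    The default limit of 0.10 mirrors the "less than 10%" figure in the text.
    """
    return secreted_fraction(secreted, total_expressed) < limit


if __name__ == "__main__":
    # Hypothetical control well: 5 units secreted of 100 units expressed
    print(control_is_acceptable(5.0, 100.0))
```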

In order to quantitate the protease activity,
the amount of secreted substrate polypeptide is measured.
Quantitation may be achieved by subjecting the growth
media to any of the various standard assay procedures
that are well known in the art. These include, but are
not limited to, immunoblotting, ELISA,
immunoprecipitation, RIA, other colorimetric assays,
enzymatic assay or bioassay. Quantitation techniques
that employ antibodies preferably utilize antibodies
that have low cross-reactivity with the uncleaved
substrate. Preferably cross-reactivity is less than 20%
and more preferably less than 5%.
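
The reason for preferring low cross-reactivity can be seen with
a short calculation. The Python sketch below assumes a simple
additive model, which is an illustrative assumption and not a
statement from this specification: the antibody reports all of
the mature polypeptide plus a fixed fraction of any uncleaved
substrate present in the sample. The function names are
hypothetical.

```python
# Illustrative sketch under an assumed linear, additive signal model.

def apparent_signal(mature: float, uncleaved: float,
                    cross_reactivity: float) -> float:
    """Signal reported by an antibody with the given cross-reactivity.

    cross_reactivity is a fraction (0.05 means 5%).
    """
    if not 0.0 <= cross_reactivity <= 1.0:
        raise ValueError("cross_reactivity must be between 0 and 1")
    return mature + cross_reactivity * uncleaved


def overestimate(mature: float, uncleaved: float,
                 cross_reactivity: float) -> float:
    """Amount by which the mature polypeptide is overestimated."""
    return apparent_signal(mature, uncleaved, cross_reactivity) - mature


if __name__ == "__main__":
    # Hypothetical sample: 10 units mature, 100 units uncleaved
    for x in (0.20, 0.05):
        print(x, overestimate(10.0, 100.0, x))
```

Under this model the overestimate scales directly with the
cross-reactivity, so moving from 20% to 5% reduces the
contribution of uncleaved substrate fourfold.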

According to another embodiment, the present 
invention provides a method of screening for protease
inhibitors. In this method, the above-described assay is
carried out in the presence and absence of potential
inhibitors of the protease. When the assays of this
invention are performed using cells which transiently
express the substrate and protease, the inhibitor is 
preferably added immediately after transfection with the 
protease and substrate-encoding DNA sequences. When 
stable transformants are used, the potential inhibitor is 
added at the beginning of the assay. The efficacy of the 
potential inhibitor (and its ability to cross the cell 
membrane) is determined by comparing the amount of 
secreted substrate polypeptide present in the media of 
cells assayed in its presence versus its absence. 
Compounds which cause at least a 90% reduction in the 
amount of secreted substrate polypeptide are potentially 
useful protease inhibitors. 
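
The comparison described above reduces to a percent-reduction
calculation. The following Python sketch is illustrative only.
The function names are hypothetical, and both readings are
assumed to come from the same assay format and units.

```python
# Illustrative sketch; names are assumptions made for this example.

def percent_reduction(without_inhibitor: float, with_inhibitor: float) -> float:
    """Percent decrease in secreted polypeptide caused by the compound."""
    if without_inhibitor <= 0:
        raise ValueError("without_inhibitor must be positive")
    return 100.0 * (without_inhibitor - with_inhibitor) / without_inhibitor


def is_candidate_inhibitor(without_inhibitor: float, with_inhibitor: float,
                           threshold: float = 90.0) -> bool:
    """True if the reduction meets the threshold (90% by default)."""
    return percent_reduction(without_inhibitor, with_inhibitor) >= threshold


if __name__ == "__main__":
    # Hypothetical ELISA readings in the absence and presence of a compound
    print(percent_reduction(200.0, 20.0))
    print(is_candidate_inhibitor(200.0, 20.0))
```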

In order that the invention described herein 
may be more fully understood, the following examples are 
set forth. It should be understood that these examples 
are for illustrative purposes only and are not to be 
construed as limiting this invention in any manner. 

EXAMPLE 1 
Construction Of Expression Plasmids
A. HCV NS3 Protease

We cloned the nucleotide sequence coding for
the entire, intact HCV NS3 protease, an NS3-4A
polyprotein or a truncated NS3 consisting of amino acids
1 to 180 into the mammalian expression plasmid pcDL-SRα
[Y. Takebe et al., Mol. Cell. Biol., 8, pp. 466-72
(1988)]. That plasmid contains an SV40 origin of
replication and an HTLV LTR enhancer/promoter sequence
which ultimately drives the high level expression of the
NS3 coding sequences (Figure 1).

The respective NS3 coding fragments (full
length NS3, NS3-4A polyprotein or truncated NS3 (amino
acids 1-181)) were obtained by PCR of the corresponding
portions of a full length HCV H strain cDNA (SEQ ID
NO:1). For each of the three coding fragments the
following 5' primer was used (SEQ ID NO:2):

5' GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG 3'.

The fragment-specific 3' primers used were:

NS3 - (SEQ ID NO:3):
3' GAAGATCTGAATTCTAGATTTTACGTGACGACCTCCACGTCGGC 5';

NS3-4A - (SEQ ID NO:4):
3' GAAGATCTGAATTCTAGATTTTAGCACTCTTCCATCTCATCGAA 5'; and

NS3 (1-181) - (SEQ ID NO:5):
3' GAAGATCTGAATTCTAGATTTTAGGATCTCATGGTTGTCTCTAGG 5'.

These primers produced PCR-amplified fragments containing
multiple restriction sites at either end for ease of
cloning.
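
The restriction sites carried by a primer can be located with a
simple string search. The Python sketch below scans the 5'
primer given above for a handful of common recognition
sequences. The selection of enzymes in the table and the
function name find_sites are choices made for this illustration
and are not taken from the specification.

```python
# Illustrative sketch: locate recognition sequences in a primer.
# The enzyme table is an assumption chosen for this example.
SITES = {
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "SacI": "GAGCTC",
    "NcoI": "CCATGG",
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
}

# 5' primer as written in the text, 5' to 3'
FIVE_PRIME_PRIMER = "GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG"


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based index of its first site in seq.

    Enzymes whose recognition sequence is absent are omitted.
    """
    seq = seq.upper()
    hits = {}
    for name, site in SITES.items():
        pos = seq.find(site)
        if pos != -1:
            hits[name] = pos
    return hits


if __name__ == "__main__":
    for name, pos in sorted(find_sites(FIVE_PRIME_PRIMER).items(),
                            key=lambda item: item[1]):
        print(name, pos)
```

Because all of these recognition sequences are palindromic, a
search of one strand is sufficient to find a site on the
double-stranded PCR product.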

In order to ligate the fragments to the vector,
the vector was first cleaved with PstI and EcoRI to
remove a small fragment. The cut vector was then
purified and ligated to the respective PstI/EcoRI cut NS3
protease-encoding fragment.

B. IL-1β/NS3 Substrate

A derivative of plasmid pKV containing the
pre-IL-1β coding sequence has been described by P. K. Wilson
et al., Nature, 370, pp. 253-70 (1994). That plasmid
contains the SV40 origin of replication and the early
promoter. The pre-IL-1β sequence was cloned between the
SpeI and BglII sites shown in Figure 2.

We inserted a double stranded synthetic DNA
fragment (SEQ ID NO:6) which encoded 20 amino acids (SEQ
ID NO:7: GADTEDVVCCSMSYTWTGVH) and contained linkers at
both ends that included an ApaLI restriction site. The
DNA was cloned into the ApaLI site in pre-IL-1β (between
the codons for amino acids His115 and Asp116), immediately
upstream of the native cleavage site (located between
Asp116 and Ala117). The first 18 amino acids of the insert
correspond to the HCV peptide 5A/5B cleavage site. The
last two amino acids are encoded by the linker. The
inserted DNA maintained the reading frame of the native
pre-IL-1β protein. The resulting substrate is referred
to throughout the application as "pre-IL-1β*".

NS3 cleaves the inserted peptide between the
cysteine and serine residues. Because the COS cells we
utilized in this assay were incapable of cleaving
pre-IL-1β (data not shown), we did not have to knock out
the native pre-IL-1β cleavage site.
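
The position of the scissile bond within the inserted peptide
can be illustrated with a short Python sketch. The sketch
assumes that the 20-residue insert reads GADTEDVVCCSMSYTWTGVH;
the "VV" reading is an assumption adopted here because it is
consistent with the stated length of 20 amino acids. The
function name scissile_index is hypothetical.

```python
# Illustrative sketch; the insert sequence below is an assumed reading.
INSERT = "GADTEDVVCCSMSYTWTGVH"


def scissile_index(peptide: str, p1: str = "C", p1_prime: str = "S"):
    """Return the index of the residue just after the cleaved bond.

    The bond lies between peptide[idx - 1] (P1) and peptide[idx] (P1').
    Returns None if the P1-P1' pair is not present.
    """
    i = peptide.find(p1 + p1_prime)
    if i == -1:
        return None
    return i + 1


if __name__ == "__main__":
    idx = scissile_index(INSERT)
    # N-terminal and C-terminal products of a Cys/Ser cleavage
    print(INSERT[:idx], INSERT[idx:])
```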

In another construct, we performed site
directed mutagenesis to alter the native pre-IL-1β
cleavage site of Asp116-Ala117-Pro118 to Cys-Ser-Met, a
conserved recognition sequence for NS3. This construct
is referred to throughout the application as
"pre-IL-1β(CSM)".

C. NS3-4A-Δ4B-IL-1β

In order to create a single fusion polypeptide
that comprised both the exogenous protease and the
polypeptide substrate, we utilized the fact that NS3 can
autoprocess (cleave) an NS3-4A-4B polyprotein at both the
NS3-4A and 4A-4B junctions.

We isolated a DNA fragment that encoded NS3-4A
and the first 60 amino acids of 4B through PCR using the
HCV strain H cDNA referred to above (SEQ ID NO:1) and the
following primers: SEQ ID NO:8:
5' GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG 3' and
SEQ ID NO:9: 3' GGACGCGGTCTGCAGGAGGCCGAGGGC 5'. The PCR
products were digested with PstI and XbaI prior to
cloning.

The mature IL-1β portion of the construct
(amino acids 117-269 of SEQ ID NO:11) was created by PCR
cloning of full length pre-IL-1β cDNA (SEQ ID NO:10)
using the following primers:
SEQ ID NO:12: 5' CTCGGCCTCCTGCAGGCACCTGTACGATCACTGAAC 3';
and SEQ ID NO:13: 3' GGGAATTCTAGATTTTAGGAAGACACAAATTG 5'.
These PCR products were digested with PstI and EcoRI
prior to cloning.

The NS3-4A-Δ4B and IL-1β fragments were then
ligated together with XbaI/EcoRI-digested pcDL-SRα to
obtain the desired construct.

As a control we created a mutant NS3 protease
fusion protein construct. This construct was identical
to the one described above, except that the NS3 portion
was created by PCR using the same primers and the cDNA of
the NS3 active site mutant S1165A [A. Grakoui et al., J.
Virol., 67, pp. 2832-43 (1993)]. The NS3 active site
mutant contains a serine-to-alanine mutation in its
active site, rendering the enzyme inactive.

EXAMPLE 2
Transfection Of COS Cells And Assay Of Secreted IL-1β

The expression plasmid constructs described in
Example 1 were transfected into COS-7 cells using the
DEAE-Dextran transfection protocol [Gu et al., Neuron, 5,
pp. 147-57 (1990)]. COS cells in 6-well clusters or 100
mm dishes at 50% confluency were transfected with 4-10 µg
of the desired plasmid in a DEAE-Dextran solution.
Following transfection, the cells were incubated an
additional 48 hours before assaying.

The processing of pre-IL-1β or NS3-4A-Δ4B-IL-1β
fusion protein and subsequent secretion of mature IL-1β
into the media was measured by ELISA of IL-1β using an
antibody that was specific for mature IL-1β (approx. 3%
cross-reactivity with pre-IL-1β). We analyzed expression
by harvesting the COS cells in ice-cold phosphate
buffered saline, lysing the cells in a 0.1% Triton X-100
buffer and centrifuging the lysate to remove cell debris.
The lysates were then analyzed by SDS-PAGE and
immunoblotting using an IL-1β antibody (Genzyme) and an
NS3 antibody. Alternatively, expression, processing and
secretion were analyzed by labelling the cells for 24
hours in the presence of [35S]-methionine, incubating the
cells for an additional 24 hours after the label was
removed and then utilizing immunoprecipitation and
SDS-PAGE to analyze the polypeptides.

EXAMPLE 3 

NS3-Specific Processing Of An NS3-4A-Δ4B-IL-1β Fusion
Protein And Secretion Of Δ4B-IL-1β Into The Media

Transfectants expressing the NS3-4A-Δ4B-IL-1β
fusion protein autoprocessed that protein at both the
NS3-4A and 4A-4B junctions. The cell lysates of these
transfectants were subjected to Western blotting
utilizing an anti-NS3 antibody. Figure 3, panel A, Wt-1
and Wt-2 lanes, shows that this experiment produced a
doublet band in the 70 kDa area, where only a single
band was present in the untransformed control cells
(panel A, No DNA lane). The second band of the doublet in
the Wt-1 and Wt-2 lanes corresponds to the size of mature
NS3. A transfectant that expressed an inactive mutant
NS3-containing NS3-4A-Δ4B-IL-1β fusion protein
demonstrated no 70 kDa doublet and therefore was not
autoprocessed (NS3 mutant lane). A transfectant that
co-expressed the same mutant fusion protein together with
a truncated, but active NS3, NS3 (1-180), was also
analyzed. Surprisingly, the mutant fusion protein did not
appear to be cleaved by NS3 (1-180), as indicated by the
lack of a doublet in the 70 kDa region (NS3 mutant +
NS3 (1-180) lane). However, a 20 kDa band representing the
truncated NS3 was detected in that lysate, as indicated
by the NS3 (1-180) arrow.

A similar experiment performed on cell lysates
utilizing a mature IL-1β-specific antibody demonstrated
the presence of a band corresponding in size to the
Δ4B-IL-1β portion of the fusion protein in both the
NS3-4A-Δ4B-IL-1β transfectants (Figure 3, panel B, Wt-1 and
Wt-2 lanes) and, to a lesser degree, in the NS3 mutant fusion
protein/NS3(1-180) cotransfectant. Virtually no IL-1β
was detected in the NS3 mutant fusion protein expressing
transfectant (IL-1β arrow). These experiments confirm
that the cleavage observed in the wild type
NS3-4A-Δ4B-IL-1β transfectants was dependent upon NS3 protease
activity. Thus, we had proof that cleavage of this
fusion protein was essentially NS3-dependent and not
caused by some endogenous protease.

Secretion of the cleaved substrate was
determined by assaying culture media with a commercially
available mature IL-1β-specific ELISA assay (R&D Systems,
Minneapolis, MN). For the wild-type NS3-containing
construct we detected a concentration of 2.5 µg/ml of
IL-1β in the medium. We detected less than 0.25 µg/ml of
IL-1β in the media of cells transfected with the mutant
NS3-containing construct. Immunoprecipitation experiments
utilizing the same anti-IL-1β antibody demonstrated the
presence of Δ4B-IL-1β in the media of cells containing
the wild type NS3-containing construct, but none from the
mutant NS3-containing construct (Figure 4), thus
confirming these results.
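
The ELISA comparison above can be expressed as a signal-to-background ratio. The following is a minimal sketch, assuming only the two concentrations reported in this example; the function name is illustrative and is not part of the described method. Because the mutant value is reported only as an upper bound, the ratio is a lower bound.

```python
def fold_over_background(signal, background):
    """Return signal / background for two concentrations in the same units."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background


# Concentrations of IL-1 beta in the culture medium, as reported above.
wild_type_conc = 2.5        # wild-type NS3 construct
mutant_upper_bound = 0.25   # mutant NS3 construct, reported as "less than"

# The true ratio is at least this large, since the mutant value is a bound.
min_fold = fold_over_background(wild_type_conc, mutant_upper_bound)
print(f"wild type / mutant >= {min_fold:.1f}-fold")
```

A ratio of this kind gives a rough sense of the assay window available for detecting inhibition.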






EXAMPLE 4

NS3-Specific Processing Of Mutated Pre-IL-1β
Containing An Artificial Cleavage Site And
Secretion Of IL-1β Into The Media

We confirmed that NS3 protease can cleave
artificial substrates other than an HCV polypeptide by
cotransfecting COS cells with the NS3-4A and either of
the pre-IL-1β-containing artificial substrate expression
constructs described in Example 1C.

Co-expression of the NS3-4A and pre-IL-1β*
substrate sequences resulted in rapid cleavage of the
substrate and concomitant secretion of a 19 kDa IL-1β into
the media. Secretion was quantitated using an ELISA
specific for the processed form of IL-1β. An immunoblot
of cell lysates from these transformants demonstrated the
presence of both cleaved and uncleaved substrate (Figure
5, NS3-4A + IL-1β* lane). The same experiment was
performed using cells that were metabolically labelled
with [35S]-methionine, followed by immunoprecipitation of
the media with the processed IL-1β-specific antibody.
The results of the immunoprecipitation experiment are
shown in Figure 6, NS3-4A + pre-IL-1β* lanes.

When we coexpressed NS3-4A and the
pre-IL-1β(CSM) sequences, we also observed cleavage of the
substrate at the predicted Cys116-Ser117 site. Both cleaved
and uncleaved forms were observed in cell lysates using
immunoblotting specific for IL-1β (Figure 5, NS3-4A +
IL-1β(CSM) lane). Immunoprecipitation of the media from
[35S]-methionine labelled cells also demonstrated the
presence of IL-1β-containing cleavage product, but less than
that observed for the 5A-5B-containing pre-IL-1β
substrate (Figure 6, NS3-4A + pre-IL-1β(CSM) lane).

EXAMPLE 5

Assay of NS3 Inhibitors
We tested the potential of compounds VH-15924
and VH-16075 as HCV NS3 protease inhibitors in our
assays.

Transfectants expressing the NS3-4A-Δ4B-IL-1β
fusion protein were grown in the presence of varying
amounts of VH-15924. Even at concentrations as high as
100 µM, we detected the presence of the cleavage product,
Δ4B-IL-1β, in the media. This indicated that VH-15924 was
not an effective inhibitor of NS3 protease.

We also assayed the inhibition of cleavage and
secretion of pre-IL-1β* substrate by both VH-15924 and
VH-16075. VH-16075 inhibited cleavage and secretion with
an IC50 of 4 µM. As in the previous experiment, VH-15924
did not completely inhibit cleavage/secretion even at
concentrations of 100 µM (Figure 7).
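
The text reports an IC50 but does not state how it was derived from the ELISA readings. One common approach is sketched below, under stated assumptions: percent inhibition is computed relative to an untreated control, and the IC50 is estimated by linear interpolation on a log-concentration scale between the two doses that bracket 50% inhibition. The function names and the example data are hypothetical and are not taken from Figure 7.

```python
import math


def percent_inhibition(signal, uninhibited, background=0.0):
    """Percent reduction of secreted signal relative to an untreated control."""
    return 100.0 * (1.0 - (signal - background) / (uninhibited - background))


def ic50_log_interpolation(concentrations, inhibitions):
    """Estimate IC50 by interpolating on log10(concentration).

    concentrations: ascending positive doses (e.g. in micromolar).
    inhibitions: percent inhibition at each dose.
    Returns None if no adjacent pair of doses brackets 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10.0 ** log_ic50
    return None


# Hypothetical dose-response data for illustration only.
doses_um = [1.0, 10.0, 100.0]
effective = [25.0, 75.0, 95.0]      # crosses 50% between 1 and 10
ineffective = [5.0, 10.0, 20.0]     # never reaches 50%

print(ic50_log_interpolation(doses_um, effective))
print(ic50_log_interpolation(doses_um, ineffective))
```

A compound that never reaches 50% inhibition over the tested range, as described for VH-15924, yields no estimate in this sketch rather than an extrapolated value.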

While I have hereinbefore presented a number of
embodiments of this invention, it is apparent that my
basic construction can be altered to provide other
embodiments which utilize the methods of this invention.
Therefore, it will be appreciated that the scope of this
invention is to be defined by the claims appended hereto
rather than the specific embodiments which have been
presented hereinbefore by way of example.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Su, Michael

(ii) TITLE OF INVENTION: METHODS AND HOST CELLS FOR ASSAYING 
EXOGENOUS AND ENDOGENOUS PROTEASE ACTIVITY 

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fish & Neave 

(B) STREET: 1251 Avenue of the Americas 

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: United States of America 

(F) ZIP: 10020 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Haley Jr, James F

(B) REGISTRATION NUMBER: 27,794 

(C) REFERENCE/DOCKET NUMBER: VPI/95-01 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 212-596-9000 

(B) TELEFAX: 212-596-9090 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 3420..5312

(D) OTHER INFORMATION: /product= "NS3 protease"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 5313..5474

(D) OTHER INFORMATION: /product= "NS4A"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 5475..5552

(D) OTHER INFORMATION: /product= "truncated NS4B"
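
The feature coordinates above can be cross-checked arithmetically. The sketch below assumes the locations are 1-based and inclusive, as is usual for sequence listings, and confirms that each feature spans a whole number of codons and that the three features are contiguous. The dictionary and helper names are illustrative only.

```python
# Feature locations as given in the listing for SEQ ID NO: 1.
features = {
    "NS3 protease": (3420, 5312),
    "NS4A": (5313, 5474),
    "truncated NS4B": (5475, 5552),
}


def feature_length(start, end):
    """Length in nucleotides of a 1-based, inclusive location."""
    return end - start + 1


summary = {}
for name, (start, end) in features.items():
    nucleotides = feature_length(start, end)
    codons, remainder = divmod(nucleotides, 3)
    summary[name] = (nucleotides, codons, remainder)
    print(f"{name}: {nucleotides} nt, {codons} codons, remainder {remainder}")

# Each feature should begin one base after the previous one ends.
ordered = list(features.values())
contiguous = all(
    nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:])
)
print("contiguous:", contiguous)
```

Under these assumptions the NS3, NS4A and truncated NS4B features encode 631, 54 and 26 residues, respectively, with no leftover bases.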



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



GCCAGCCCCC TGATGGGGGC GACACTCCAC CATAGATCAC TCCCCTGTGA GGAACTACTG 60

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTGCAG CCTCCAGGAC 120

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTGCCAG 180

GACGACCGGG TCCTTTCTTG GATAAACCCG CTCAATGCCT GGAGATTTGG GCGTGCCCCC 240

GCAAGACTGC TAGCCGAGTA GTGTTGGGTC GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300

GTGCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAC CATGAGCACG AATCCTAAAC 360

CTCAAAGAAA AACCAAACGT AACACCAACC GTCGCCCACA GGACGTCGAG TTCCCGGGTG 420

GCGGTCAGAT CGTTGGTGGA GTTTACTTGT TGCCGCGCAG GGGCCCTAGA TTGGGTGTGC 480

GCGCGACGAG GAAGACTTCC GAGCGGTCGC AACCTCGTGG TAGACGTCAG CCTATCCCCA 540

AGGCACGTCG GCCCGAGGGC AGGACCTGGG CTCAGCCCGG GTACCCTTGG CCCCTCTATG 600

GCAATGAGGG TTGCGGGTGG GCGGGATGGC TCCTGTCTCC CCGTGGCTCT CGGCCTAGCT 660

GGGGCCCCAC AGACCCCCGG CGTAGGTCGC GCAATTTGGG TAAGGTCATC GATACCCTTA 720

CGTGCGGCTT CGCCGACCTC ATGGGGTACA TACCGCTCGT CGGCGCCCCT CTTGGAGGCG 780

CTGCCAGGGC CCTGGCGCAT GGCGTCCGGG TTCTGGAAGA CGGCGTGAAC TATGCAACAG 840

GGAACCTTCC TGGTTGCTCT TTCTCTATCT TCCTTCTGGC CCTGCTCTCT TGCCTGACTG 900

TGCCCGCTTC AGCCTACCAA GTGCGCAATT CCTCGGGGCT TTACCATGTC ACCAATGATT 960

GCCCTAATTC GAGTATTGTG TACGAGGCGG CCGATGCCAT CCTGCACACT CCGGGGTGTG 1020

TCCCTTGCGT TCGCGAGGGT AACGCCTCGA GGTGTTGGGT GGCGGTGACC CCCACGGTGG 1080

CCACCAGGGA CGGCAAACTC CCCACAACGC AGCTTCGACG TCATATCGAT CTGCTTGTCG 1140

GGAGCGCCAC CCTCTGCTCA GCCCTCTACG TGGGGGACCT GTGCGGGTCT GTTTTTCTTG 1200

TTGGTCAACT GTTTACCTTC TCTCCCAGGC GCCACTGGAC GACGCAAAGC TGCAATTGTT 1260

CTATCTATCC CGGCCATATA ACGGGTCATC GCATGGCATG GGATATGATG ATGAACTGGT 1320

CCCCTACGGC AGCGTTGGTG GTAGCTCAGC TGCTCCGGAT CCCACAAGCC ATCATGGACA 1380

TGATCGCTGG TGCTCACTGG GGAGTCCTGG CGGGCATAGC GTATTTCTCC ATGGTGGGGA 1440

ACTGGGCGAA GGTCCTGGTA GTGCTGCTGC TATTTGCCGG CGTCGACGCG GAAACCCACG 1500

TCACCGGGGG AAGTGCCGGC CACACCACGG CTGGGCTTGT TGGTCTCCTT ACACCAGGCG 1560

CCAAGCAGAA CATCCAACTG ATCAACACCA ACGGCAGTTG GCACATCAAT AGCACGGCCT 1620

TGAACTGCAA CGATAGCCTT ACCACCGGCT GGTTAGCAGG GCTCTTCTAT CGCCACAAAT 1680

TCAACTCTTC AGGCTGTCCT GAGAGGTTGG CCAGCTGCCG ACGCCTTACC GATTTTGCCC 1740 

AGGGCTGGGG TCCCATCAGT TATGCCAACG GAAGCGGCCT TGACGAACGC CCCTACTGTT 1800 

GGCACTACCC TCCAAGACCT TGTGGCATTG TGCCCGCAAA GAGCGTGTGT GGCCCGGTAT 1860 

ATTGCTTCAC TCCCAGCCCC GTGGTGGTGG GAACGACCGA CAGGTCGGGC GCGCCTACCT 1920 

ACAGCTGGGG TGCAAATGAT ACGGATGTCT TCGTCCTTAA CAACACCAGG CCACCGCTGG 1980 

GCAATTGGTT CGGTTGTACC TGGATGAACT CAACTGGATT CACCAAAGTG TGCGGAGCGC 2040 

CCCCTTGTGT CATCGGAGGG GTGGGCAACA ACACCTTGCT CTGCCCCACT GATTGCTTCC 2100 

GCAAACATCC GGAAGCCACA TACTCTCGGT GCGGCTCCGG TCCCTGGATT ACACCCAGGT 2160 

GCATGGTCGA CTACCCGTAT AGGCTTTGGC ACTATCCTTG TACTATCAAT TACACCATAT 2220

TCAAAGTCAG GATGTACGTG GGAGGGGTCG AGCACAGGCT GGAAGCGGCC TGCAACTGGA 2280 

CGCGGGGCGA ACGCTGTGAT CTGGAAGACA GGGACAGGTC CGAGCTCAGC CCATTGCTGC 2340 

TGTCCACCAC ACAGTGGCAG GTCCTTCCGT GTTCTTTCAC GACCCTGCCA GCCTTGTCCA 2400 

CCGGCCTCAT CCACCTCCAC CAGAACATTG TGGACGTGCA GTACTTGTAC GGGGTGGGGT 2460 

CAAGCATCGC GTCCTGGGCC ATTAAGTGGG AGTACGTCGT TCTCCTGTTC CTTCTGCTTG 2520 

CAGACGCGCG CGTCTGCTCC TGCTTGTGGA TGATGTTACT CATATCCCAA GCGGAGGCGG 2580 

CTTTGGAGAA CCTCGTAATA CTCAATGCAG CATCCCTGGC CGGGACGCAC GGTCTTGTGT 2640 

CCTTCCTCGT GTTCTTCTGC TTTGCGTGGT ATCTGAAGGG TAGGTGGGTG CCCGGAGCGG 2700 

TCTACGCCTT CTACGGGATG TGGCCTCTCC TCCTGCTCCT GCTGGCGTTG CCTCAGCGGG 2760

CATACGCACT GGACACGGAG GTGGCCGCGT CGTGTGGCGG CGTTGTTCTT GTCGGGTTAA 2820 

TGGCGCTGAC TCTGTCACCA TATTACAAGC GCTATATCAG CTGGTGCATG TGGTGGCTTC 2880 

AGTATTTTCT GACCAGAGTA GAAGCGCAAC TGCACGTGTG GGTTCCCCCC CTCAACGTCC 2940 

GGGGGGGGCG CGATGCCGTC ATCTTACTCA TGTGTGTTGT ACACCCGACT CTGGTATTTG 3000 

ACATCACCAA ACTACTCCTG GCCATCTTCG GACCCCTTTG GATTCTTCAA GCCAGTTTGC 3060 

TTAAAGTCCC CTACTTCGTG CGCGTTCAAG GCCTTCTCCG GATCTGCGCG CTAGCGCGGA 3120 

AGATAGCCGG AGGTCATTAC GTGCAAATGG CCATCATCAA GTTGGGGGCG CTTACTGGCA 3180 

CCTATGTGTA TAACCATCTC ACCCCTCTTC GAGACTGGGC GCACAACGGC CTGCGAGATC 3240 

TGGCCGTGGC TGTGGAACCA GTCGTCTTCT CCCGAATGGA GACCAAGCTC ATCACGTGGG 3300 

GGGCAGATAC CGCCGCGTGC GGTGACATCA TCAACGGCTT GCCCGTCTCT GCCCGTAGGG 3360 

GCCAGGAGAT ACTGCTTGGA CCAGCCGACG GAATGGTCTC CAAGGGGTGG AGGTTGCTGG 3420 

CGCCCATCAC GGCGTACGCC CAGCAGACGA GAGGCCTCCT AGGGTGTATA ATCACCAGCC 3480 

TGACTGGCCG GGACAAAAAC CAAGTGGAGG GTGAGGTCCA GATCGTGTCA ACTGCTACCC 3540

AAACCTTCCT GGCAACGTGC ATCAATGGGG TATGCTGGAC TGTCTACCAC GGGGCCGGAA 3600

CGAGGACCAT CGCATCACCC AAGGGTCCTG TCATCCAGAT GTATACCAAT GTGGACCAAG 3660

ACCTTGTGGG CTGGCCCGCT CCTCAAGGTT CCCGCTCATT GACACCCTGC ACCTGCGGCT 3720

CCTCGGACCT TTACCTGGTT ACGAGGCACG CCGACGTCAT TCCCGTGCGC CGGCGAGGTG 3780

ATAGCAGGGG TAGCCTGCTT TCGCCCCGGC CCATTTCCTA CCTAAAAGGC TCCTCGGGGG 3840

CTCCGCTGTT GTGCCCCGCG GGACACGCCG TGGGCCTATT CAGGGCCGCG GTGTGCACCC 3900

GTGGAGTGAC CAAGGCGGTG GACTTTATCC CTGTGGAGAA CCTAGAGACA ACCATGAGAT 3960

CCCCGGTGTT CACGGACAAC TCCTCTCCAC CAGCAGTGCC CCAGAGCTTC CAGGTGGCCC 4020

ACCTGCATGC TCCCACCGGC AGTGGTAAGA GCACCAAGGT CCCGGCTGCG TACGCAGCCC 4080

AGGGCTACAA GGTGTTGGTG CTCAACCCCT CTGTTGCTGC AACGCTGGGC TTTGGTGCTT 4140

ACATGTCCAA GGCCCATGGG GTCGATCCTA ATATCAGGAC CGGGGTGAGA ACAATTACCA 4200

CTGGCAGCCC CATCACGTAC TCCACCTACG GCAAGTTCCT TGCCGACGGC GGGTGCTCAG 4260

GAGGCGCTTA TGACATAATA ATTTGTGACG AGTGCCACTC CACGGATGCC ACATCCATCT 4320

TGGGCATCGG CACTGTCCTT GACCAAGCAG AGACTGCGGG GGCGAGATTG GTTGTGCTCG 4380

CCACTGCTAC CCCTCCGGGC TCCGTCACTG TGTCCCATCC TAACATCGAG GAGGTTGCTC 4440

TGTCCACCAC CGGAGAGATC CCTTTCTACG GCAAGGCTAT CCCCCTCGAG GTGATCAAGG 4500

GGGGAAGACA TCTCATCTTC TGTCACTGAA AGAAGAAGTG CGACGAGCTC GCCGCGAAGC 4560

TGGTCGCATT GGGCATCAAT GCCGTGGCCT ACTACCGCGG ACTTGACGTG TCTGTCATCC 4620

CGACCAACGG CGATGTTGTC GTCGTGTCGA CCGATGCTCT CATGACTGGC TTTACCGGCG 4680

ACTTCGACTC TGTGATAGAC TGCAACACGT GTGTCACTCA GACAGTCGAT TTCAGCCTTG 4740

ACCCTACCTT TACCATTGAG ACAACCACGC TCCCCCAGGA TGCTGTCTCC AGGACTCAGC 4800

GCCGGGGCAG GACTGGCAGG GGGAAGCCAG GCATCTACAG ATTTGTGGCA CCGGGGGAGC 4860

GCCCCTCCGG CATGTTCGAC TCGTCCGTCC TCTGTGAGTG CTATGACGCG GGCTGTGCTT 4920

GGTATGAGCT CATGCCCGCC GAGACTACAG TTAGGCTACG AGCGTACATG AACACCCCGG 4980

GGCTTCCCGT GTGCCAGGAC CATCTTGAAT TTTGGGAGGG CGTCTTTACG GGCCTCACCC 5040

ATATAGATGC CCACTTTCTA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCTTACCTGG 5100

TAGCGTACCA AGCCACCGTG TGCGCTAGGG CTCAAGCCCC TCCCCCATCG TGGGACCAGA 5160

TGTGGAAGTG TTTGATCCGC CTTAAACCCA CCCTCCATGG GCCAACACCC CTGCTATACA 5220

GACTGGGCGC TGTTCAGAAT GAAGTCACCC TGACGCACCC AATCACCAAA TACATCATGA 5280

CATGCATGTC GGCCGACCTG GAGGTCGTCA CGAGCACCTG GGTGCTCGTT GGCGGCGTCC 5340


TGGCTGCTCT GGCCGCGTAT TGCCTGTCAA CAGGCTGCGT GGTCATAGTG GGCAGGATTG 5400

TCTTGTCCGG GAAGCCGGCA ATTATACCTG ACAGGGRGGT TCTCTACCAG GAGTTCGATG 5460

AGATGGAAGA GTGCTCTCAG CACTTACCGT ACATCGAGCA AGGGATGATG CTCGCTGAGC 5520

AGTTCAAGCA GAAGGCCCTC GGCCTCCTGC AGACCGCGTC CCGCCATGCA GAGGTTATCA 5580

CCCCTGCTGT CCAGACCAAC TGGCAGAAAC TCGAGGTCTT CTGGGCGAAG CACATGTGGA 5640

ATTTCATCAG TGGGATACAA TATTTGGCGG GCCTGTCAAC GCTGCCTGGT AACCCCGCCA 5700

TTGCTTCATT GATGGCTTTT ACAGCTGCCG TCACCAGCCC ACTAACCACT GGCCAAACCC 5760

TCCTCTTCAA CATATTGGGG GGGTGGGTGG CTGCCCAGCT CGCCGCCCCC GGTGCCGCTA 5820

CCGCCTTTGT GGGCGCTGGC TTAGCTGGCG CCGCCATCGG CAGCGTTGGA CTGGGGAAGG 5880

TCCTCGTGGA CATTCTTGCA GGGTATGGCG CGGGCGTGGC GGGAGCTCTT GTAGCATTCA 5940

AGATCATGAG CGGTGAGGTC CCCTCCACGG AGGACCTGGT CAATCTGCTG CCCGCCATCC 6000

TCTCGCCTGG AGCCCTTGTA GTCGGTGTGG TCTGCGCAGC AATACTGCGC CGGCACGTTG 6060

GCCCGGGCGA GGGGGCAGTG CAATGGATGA ACCGGCTAAT AGCCTTCGCC TCCCGGGGGA 6120

ACCATGTTTC CCCCACGCAC TACGTGCCGG AGAGCGATGC AGCCGCCCGC GTCACTGCCA 6180

TACTCAGCAG CCTCACTGTA ACCCAGCTCC TGAGGCGACT ACATCAGTGG ATAAGCTCGG 6240

AGTGTACCAC TCCATGCTCC GGCTCCTGGC TAAGGGACAT CTGGGACTGG ATATGCGAGG 6300

TGCTGAGCGA CTTTAAGACC TGGCTGAAAG CCAAGCTCAT GCCACAACTG CCTGGGATTC 6360

CCTTTGTGTC CTGCCAGCGC GGGTATAGGG GGGTCTGGCG AGGAGACGGC ATTATGCACA 6420

CTCGCTGCCA CTGTGGAGCT GAGATCACTG GACATGTCAA AAACGGGACG ATGAGGATCG 6480

TCGGTCCTAG GACCTGCAGG AACATGTGGA GTGGGACGTT CCCCATTAAC GCCTACACCA 6540

CGGGCCCCTG TACTCCCCTT CCTGCGCCGA ACTATAAGTT CGCGCTGTGG AGGGTGTCTG 6600

CAGAGGAATA CGTGGAGATA AGGCGGGTGG GGGACTTCCA CTACGTATCG GGTATGACTA 6660

CTGACAATCT TAAATGCCCG TGCCAGATCC CATCGCCCGA ATTTTTCACA GAATTGGACG 6720

GGGTGCGCCT ACATAGGTTT GCGCCCCCTT GCAAGCCCTT GCTGCGGGAG GAGGTATCAT 6780

TCAGAGTAGG ACTCCACGAG TACCCGGTGG GGTCGCAATT ACCTTGCGAG CCCGAACCGG 6840

ACGTAGCCGT GTTGACGTCC ATGCTCACTG ATCCCTCCCA TATAACAGCA GAGGCGGCCG 6900

GGAGAAGGTT GGCGAGAGGG TCACCCCCTT CTATGGCCAG CTCCTCGGCC AGCCAGCTGT 6960

CCGCTCCATC TCTCAAGGCA ACTTGCACCG CCAACCATGA CTCCCCTGAC GCCGAGCTCA 7020

TAGAGGCTAA CCTCCTGTGG AGGCAGGAGA TGGGCGGCAA CATCACCAGG GTTGAGTCAG 7080

AGAACAAAGT GGTGATTCTG GACTCCTTCG ATCCGCTTGT GGCAGAGGAG GATGAGCGGG 7140

AGGTCTCCGT ACCCGCAGAA ATTCTGCGGA AGTCTCGGAG ATTCGCCCGG GCCCTGCCCG 7200

TTTGGGCGCG GCCGGACTAC AACCCCCCGC TAGTAGAGAC GTGGAAAAAG CCTGACTACG 7260

AACCACCTGT GGTCCATGGC TGCCCGCTAC CACCTCCACG GTCCCCTCCT GTGCCTCCGC 7320


CTCGGAAAAA GCGTACGGTG GTCCTCACCG AATCAACCCT ACCTACTGCC TTGGCCGAGC 7380

TTGCCACGAA AAGTTTTGGC AGCTCCTCAA CTTCCGGCAT TACGGGCGAC AATATGACAA 7440

CATCCTCTGA GCCCGCCCCT TCTGGCTGCC CCCCCGACTC CGACGTTGAG TCCTATTCTT 7500

CCATGCCCCC CCTGGAGGGG GAGCCTGGGG ATCCGGATTT CAGCGACGGG TCATGGTCGA 7560

CGGTCAGTAG TGGGGCCGAC ACGGAAGATG TCGTGTGCTG CTCAATGTCT TATACCTGGA 7620

CAGGCGCACT CGTCACCCCG TGCGCTGCGG AAGAACAAAA ACTGCCCATC AACGCACTGA 7680

GCAACTCGTT GCTACGCCAT CACAATCTGG TATATTCCAC CACTTCACGC AGTGCTTGCC 7740

AAAGGCAGAA GAAAGTCACA TTTGACAGAC TGCAAGTTCT GGACAGCCAT TACCAGGACG 7800

TGCTCAAGGA GGTCAAAGCA GCGGCGTCAA AAGTGAAGGC TAACTTGCTA TCCGTAGAGG 7860

AAGCTTGCAG CCTGACGCCC CCACATTCAG CCAAATCCAA GTTTGGCTAT GGGGCAAAAG 7920

ACGTCCGTTG CCATGCCAGA AAGGCCGTAG CCCACATCAA CTCCGTGTGG AAAGACCTTC 7980

TGGAAGACAG TGTAACACCA ATAGACACTA TCATCATGGC CAAGAACGAG GTCTTCTGCG 8040

TTCAGCCTGA GAAGGGGGGT CGTAAGCCAG CTCGTCTCAT CGTGTTCCCC GACCTGGGCG 8100

TGCGCGTGTG CGAGAAGATG GCCCTGTACG ACGTGGTTAG CAAACTCCCC CTGGCCGTGA 8160

TGGGAAGCTC CTACGGATTC CAATACTCAC CAGGACAGCG GGTTGAATTC CTCGTGCAAG 8220

CGTGGAAGTC CAAGAAGACC CCGATGGGGT TCCCGTATGA TACCCGCTGT TTTGACTCCA 8280

CAGTCACTGA GAGCGACATC CGTACGGAGG AGGCAATTTA CCAATGTTGT GACCTGGACC 8340

CCCAAGCCCG CGTGGCCATC AAGTCCCTCA CTGAGAGGCT TTATGTTGGG GGCCCTCTTA 8400

CCAATTCAAG GGGGGAAAAC TGCGGCTATC GCAGGTGCCG CGCGAGCGGC GTACTGACAA 8460

CTAGCTGTGG TAACACCCTC ACTTGCTACA TCAAGGCCCG GGCAGCCCGT CGAGCCGCAG 8520

GGCTCCAGGA CTGCACCATG CTCGTGTGTG GCGACGACTT AGTCGTTATC TGTGAAAGTG 8580

CGGGGGTCCA GGAGGACGCG GCGAGCCTGA GAGCCTTTAC GGAGGCTATG ACCAGGTACT 8640

CCGCCCCCCC CGGGGACCCC CCACAACCAG AATACGACTT GGAGCTTATA ACATCATGCT 8700

CCTCCAACGT GTCAGTCGCC CACGACGGCG CTGGAAAAAG GGTCTACTAC CTTACCCGTG 8760

ACCCTACAAC CCCCCTCGCG AGAGCCGCGT GGGAGACAGC AAGACACACT CCAGTCAATT 8820

CCTGGCTAGG CAACATAATC ATGTTTGCCC CCACACTGTG GGCGAGGATG ATACTGATGA 8880

CCCATTTCTT TAGCGTCCTC ATAGCCAGGG ATCAGCTTGA ACAGGCTCTT AACTGTGAGA 8940

TCTACGCAGC CTGCTACTCC ATAGAACCAC TGGATCTACC TCCAATCATT CAAAGACTCC 9000

ATGGCCTCAG CGCATTTTTA CTCCACAGTT ACTCTCCAGG TGAAGTCAAT AGGGTGGCCG 9060

CATGCCTCAG AAAACTTGGG GTCCCGCCCT TGCGAGCTTG GAGACACCGG GCCCGGAGCG 9120

TCCGCGCTAG GCTTCTGTCC AGGGGAGGCA GGGCTGCCAT ATGTGGCAAG TACCTCTTCA 9180

ACTGGGCAGT AAGAACAAAG CTCAAACTCA CTCCAATAGC GGCCGCTGGC CGGCTGGACT 9240

TGTCCGGTTG GTTCACGGCT GGCTACAGCG GGGGAGACAT TTATCACAGC GTGTCTCATG 9300

CCCGGCCCCG CTGGTTCTGG TTTTGCCTAC TCCTGCTCGC TGCAGGGGTA GGCATCTACC 9360
TCCTCCCCAA CCGGTGAACG GGGAGCTAGA CACTCCGGCC T 9401 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GGACTAGTCT GCAGTCTAGA GCTCCATGGC GCCCATCACG GCGTACG 47 
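
A listing entry of this kind can be checked against its declared length, and a primer can be compared with the template it is meant to anneal to. The sketch below is illustrative only: the helper names are hypothetical, and the template fragment is an assumption, read off the printed SEQ ID NO: 1 listing at positions 3420 to 3440 (the start of the NS3 mat_peptide feature). It reports the length of the longest 3' portion of the primer that matches the beginning of that fragment.

```python
def parse_listing_line(line):
    """Join the base blocks of one sequence-listing line, dropping the
    trailing position number and any whitespace."""
    return "".join(tok for tok in line.split() if tok.isalpha()).upper()


def overlap_length(primer, template):
    """Length of the longest primer suffix that is also a template prefix."""
    for n in range(min(len(primer), len(template)), 0, -1):
        if primer.endswith(template[:n]):
            return n
    return 0


# SEQ ID NO: 2 as printed in the listing, with its declared length.
seq2_line = "GGACTAGTCT GCAGTCTAGA GCTCCATGGC GCCCATCACG GCGTACG 47"
declared_length = 47
primer = parse_listing_line(seq2_line)

# Assumption: bases 3420-3440 of SEQ ID NO: 1, read off the printed listing.
ns3_start = "GCGCCCATCACGGCGTACGCC"

print("length matches:", len(primer) == declared_length)
print("3' overlap with NS3 start:", overlap_length(primer, ns3_start))
```

If the fragment has been read correctly, the overlap suggests that the 3' end of this primer corresponds to the first bases of the NS3 coding region, preceded in the primer by an ATG codon and restriction sites.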
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer"

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CGGCTGCACC TCCAGCAGTG CATTTTAGAT CTTAAGTCTA GAAG 44 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AAGCTACTCT ACCTTCTCAC GATTTTAGAT CTTAAGTCTA GAAG 44 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GGATCTCTGT TGGTACTCTA GGATTTTAGA TCTTAAGTCT AGAAG 45 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "OLIGONUCLEOTIDE DUPLEX" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(v) FRAGMENT TYPE: internal 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..4 

(D) OTHER INFORMATION: /product= "SINGLE STRANDED REGION
ON CODING STRAND" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 61..64

(D) OTHER INFORMATION: /product= "SINGLE STRANDED REGION 
ON COMPLEMENTARY STRAND" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGCACGGCGC CGACACGGAA GATGTCGTGT GCTGCTCAAT GTCTTATACC TGGACAGGCG 60 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Gly Ala Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
1 5 10 15

Thr Gly Val His
20

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGACTAGTCT GCAGTCTAGA GCTCCATGGC GCCCATCACG GCGTACG 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CGGGAGCCGG AGGACGTCTG GCGCAGG
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1497 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..893

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 426..427

(D) OTHER INFORMATION: /label= ApaLIsite

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ACCAACCTCT TCGAGGCACA AGGCACAACA GGCTGCTCTG GGATTCTCTT CAGCCAATCT 60 

TCATTGCTCA AGTGTCTGAA GCAGCC ATG GCA GAA GTA CCT GAG CTC GCC AGT 113 

Met Ala Glu Val Pro Glu Leu Ala Ser 
1 5 

GAA ATG ATG GCT TAT TAC AGT GGC AAT GAG GAT GAC TTG TTC TTT GAA 161 
Glu Met Met Ala Tyr Tyr Ser Gly Asn Glu Asp Asp Leu Phe Phe Glu 
10 15 20 25 

GCT GAT GGC CCT AAA CAG ATG AAG TGC TCC TTC GAG GAC CTG GAC CTC 209
Ala Asp Gly Pro Lys Gln Met Lys Cys Ser Phe Gln Asp Leu Asp Leu
30 35 40

TGC CCT CTG GAT GGC GGC ATC CAG CTA CGA ATC TCC GAC CAC CAC TAC 257
Cys Pro Leu Asp Gly Gly Ile Gln Leu Arg Ile Ser Asp His His Tyr
45 50 55

AGC AAG GGC TTC AGG CAG GCC GCG TCA GTT GTT GTG GCC ATG GAC AAG 305
Ser Lys Gly Phe Arg Gln Ala Ala Ser Val Val Val Ala Met Asp Lys
60 65 70

CTG AGG AAG ATG CTG GTT CCC TGC CCA CAG ACC TTC CAG GAG AAT GAC 353
Leu Arg Lys Met Leu Val Pro Cys Pro Gln Thr Phe Gln Glu Asn Asp
75 80 85

CTG AGC ACC TTC TTT CCC TTC ATC TTT GAA GAA GAA CCT ATC TTC TTC 401
Leu Ser Thr Phe Phe Pro Phe Ile Phe Glu Glu Glu Pro Ile Phe Phe
90 95 100 105

GAC ACA TGG GAT AAC GAG GCT TAT GTG CAC GAT GCA CCT GTA CGA TCA 449
Asp Thr Trp Asp Asn Glu Ala Tyr Val His Asp Ala Pro Val Arg Ser
110 115 120

CTG AAC TGC ACG CTC CGG GAC TCA CAG CAA AAA AGC TTG GTG ATG TCT 497
Leu Asn Cys Thr Leu Arg Asp Ser Gln Gln Lys Ser Leu Val Met Ser
125 130 135

GGT CCA TAT GAA CTG AAA GCT CTC CAC CTC CAG GGA CAG GAT ATG GAG 545
Gly Pro Tyr Glu Leu Lys Ala Leu His Leu Gln Gly Gln Asp Met Glu
140 145 150

CAA CAA GTG GTG TTC TCC ATG TCC TTT GTA CAA GGA GAA GAA AGT AAT 593
Gln Gln Val Val Phe Ser Met Ser Phe Val Gln Gly Glu Glu Ser Asn
155 160 165

GAC AAA ATA CCT GTG GCC TTG GGC CTC AAG GAA AAG AAT CTG TAC CTG 641
Asp Lys Ile Pro Val Ala Leu Gly Leu Lys Glu Lys Asn Leu Tyr Leu
170 175 180 185

TCC TGC GTG TTG AAA GAT GAT AAG CCC ACT CTA CAG CTG GAG AGT GTA 689
Ser Cys Val Leu Lys Asp Asp Lys Pro Thr Leu Gln Leu Glu Ser Val
190 195 200

GAT CCC AAA AAT TAC CCA AAG AAG AAG ATG GAA AAG CGA TTT GTC TTC 737
Asp Pro Lys Asn Tyr Pro Lys Lys Lys Met Glu Lys Arg Phe Val Phe
205 210 215

AAC AAG ATA GAA ATC AAT AAC AAG CTG GAA TTT GAG TCT GCC CAG TTC 785
Asn Lys Ile Glu Ile Asn Asn Lys Leu Glu Phe Glu Ser Ala Gln Phe
220 225 230

CCC AAC TGG TAC ATC AGC ACC TCT CAA GCA GAA AAC ATG CCC GTC TTC 833
Pro Asn Trp Tyr Ile Ser Thr Ser Gln Ala Glu Asn Met Pro Val Phe
235 240 245

CTG GGA GGG ACC AAA GGC GGC CAG GAT ATA ACT GAC TTC ACC ATG CAA 881
Leu Gly Gly Thr Lys Gly Gly Gln Asp Ile Thr Asp Phe Thr Met Gln
250 255 260 265

TTT GTG TCT TCC TAAAGAGAGC TGTACCCAGA GAGTCCTGTG CTGAATGTGG 933
Phe Val Ser Ser



ACTCAATCCC TAGGGCTGGC AGAAAGGGAA CAGAAAGGTT TTTGAGTACG GCTATAGCCT 993

GGACTTTCCT GTTGTCTACA CCAATGCCCA ACTGCCTGCC TTAGGGTAGT GCTAAGAGGA 1053

TCTCCTGTCC ATCAGCCAGG ACAGTCAGCT CTCTCCTTTC AGGGCCAATC CCCAGCCCTT 1113

TTGTTGAGCC AGGCCTCTCT CACCTCTCCT ACTCACTTAA AGCCCGCCTG ACAGAAACCA 1173

CGGCCACATT TGGTTCTAAG AAACCCTCTG TCATTCGCTC CCACATTCTG ATGAGCAACC 1233

GCTTCCCTAT TTATTTATTT ATTTGTTTGT TTGTTTTATT CATTGGTCTA ATTTATTCAA 1293

AGGGGGCAAG AAGTAGCAGT GTCTGTAAAA GAGCCTAGTT TTTAATAGCT ATGGAATCAA 1353

TTCAATTTGG ACTGGTGTGC TCTCTTTAAA TCAAGTCCTT TAATTAAGAC TGAAAATATA 1413

TAAGCTCAGA TTATTTAAAT GGGAATATTT ATAAATGAGC AAATATCATA CTGTTCAATG 1473

GTTCTGAAAT AAACTTCTCT GAAG 1497



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Glu Val Pro Glu Leu Ala Ser Glu Met Met Ala Tyr Tyr Ser
1 5 10 15

Gly Asn Glu Asp Asp Leu Phe Phe Glu Ala Asp Gly Pro Lys Gln Met
20 25 30

Lys Cys Ser Phe Gln Asp Leu Asp Leu Cys Pro Leu Asp Gly Gly Ile
35 40 45

Gln Leu Arg Ile Ser Asp His His Tyr Ser Lys Gly Phe Arg Gln Ala
50 55 60

Ala Ser Val Val Val Ala Met Asp Lys Leu Arg Lys Met Leu Val Pro
65 70 75 80

Cys Pro Gln Thr Phe Gln Glu Asn Asp Leu Ser Thr Phe Phe Pro Phe
85 90 95

Ile Phe Glu Glu Glu Pro Ile Phe Phe Asp Thr Trp Asp Asn Glu Ala
100 105 110

Tyr Val His Asp Ala Pro Val Arg Ser Leu Asn Cys Thr Leu Arg Asp
115 120 125

Ser Gln Gln Lys Ser Leu Val Met Ser Gly Pro Tyr Glu Leu Lys Ala
130 135 140

Leu His Leu Gln Gly Gln Asp Met Glu Gln Gln Val Val Phe Ser Met
145 150 155 160

Ser Phe Val Gln Gly Glu Glu Ser Asn Asp Lys Ile Pro Val Ala Leu
165 170 175

Gly Leu Lys Glu Lys Asn Leu Tyr Leu Ser Cys Val Leu Lys Asp Asp
180 185 190

Lys Pro Thr Leu Gln Leu Glu Ser Val Asp Pro Lys Asn Tyr Pro Lys
195 200 205

Lys Lys Met Glu Lys Arg Phe Val Phe Asn Lys Ile Glu Ile Asn Asn
210 215 220

Lys Leu Glu Phe Glu Ser Ala Gln Phe Pro Asn Trp Tyr Ile Ser Thr
225 230 235 240

Ser Gln Ala Glu Asn Met Pro Val Phe Leu Gly Gly Thr Lys Gly Gly
245 250 255

Gln Asp Ile Thr Asp Phe Thr Met Gln Phe Val Ser Ser
260 265

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTCGGCCTCC TGCAGGCACC TGTACGATCA CTGAAC 36 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTTAAACACA GAAGGATTTT AGATCTTAAG GG

CLAIMS

I claim: 

1. A method for assaying exogenous protease 
activity in a host cell comprising the steps of: 

(a) incubating a host cell transformed with a 
first nucleotide sequence encoding an exogenous protease and a 
second nucleotide sequence encoding an artificial polypeptide 
substrate; 

wherein said substrate comprises: 

(i) a cleavage site for said exogenous 

protease; and 

(ii) a polypeptide that is secreted out of 
said cell following cleavage by said exogenous protease; 

under conditions which cause said exogenous protease and said 
artificial substrate to be expressed; 

(b) separating said host cell from its growth 
media under non-lytic conditions; and 

(c) assaying said growth media for the
presence of said secreted polypeptide. 

2. A method for assaying endogenous protease 
activity in a host cell comprising the steps of: 

(a) incubating a host cell transformed with a 
nucleotide sequence encoding an artificial polypeptide 
substrate; 

wherein said substrate comprises: 

(i) a cleavage site for said endogenous protease; and 

(ii) a polypeptide that is secreted out of 
said cell following cleavage by said endogenous protease; 

under conditions which cause said artificial substrate to be 
expressed; 

(b) separating said host cell from its growth 
media under non-lytic conditions; and 

(c) assaying said growth media for the 
presence of said secreted polypeptide. 

3. A method for identifying a compound as an 
inhibitor of a protease comprising the steps of: 

(a) assaying the activity of a protease in the 
absence of said compound by a method according to claim 1 or 
2; 

(b) assaying the activity of a protease in the 
presence of said compound by a method according to claim 1 or 
2, wherein said compound is added to the host cells during 
said incubation of said host cells; and 

(c) comparing the results of step (a) with the 
results of step (b). 

4. The method according to claim 1 or claim 3, 
insofar as it depends from claim 1, wherein said first 
nucleotide sequence and said second nucleotide sequence encode 
a single polypeptide. 

5. The method according to claim 4, wherein said 
first and second nucleotide sequences encode NS3-4A-Δ4B-IL-1β. 

6. The method according to any one of claims 1 to 
3, wherein said first nucleotide sequence encodes a viral 
protease or an enzymatically active fragment thereof. 

7. The method according to claim 6, wherein said 
first nucleotide sequence encodes hepatitis C virus NS3 
protease, an NS3-4A fusion protein or amino acids 1-180 of NS3 
protease. 

8. The method according to any one of claims 1 to 
3, wherein said secreted polypeptide is selected from 
polypeptides comprising mature IL-1β, mature IL-1α, basic 
fibroblast growth factor and endothelial-monocyte activating 
polypeptide II. 

9. The method according to claim 8, wherein said 
secreted polypeptide comprises mature IL-1β. 

10. The method according to claim 9, wherein said 
artificial polypeptide substrate is selected from pre-IL-1β* 
or pre-IL-1β(CSM). 

11. A host cell transformed with a nucleotide 
sequence encoding an artificial polypeptide substrate, wherein 
said substrate comprises: 

(a) a cleavage site for said exogenous protease; and 

(b) a polypeptide that is secreted out of said 
cell following cleavage by said exogenous protease; 

said host cell being capable of expressing said protease and 
said substrate. 

12. A host cell transformed with a first nucleotide 
sequence encoding an exogenous protease and a second 
nucleotide sequence encoding an artificial polypeptide 
substrate, wherein said substrate comprises: 

(a) a cleavage site for said exogenous protease; and 

(b) a polypeptide that is secreted out of said 
cell following cleavage by said exogenous protease; 

said host cell being capable of expressing said protease and 
said substrate. 

13. The host cell according to claim 11 or 12, 
wherein said secreted polypeptide is selected from 
polypeptides comprising mature IL-1β, mature IL-1α, basic 
fibroblast growth factor and endothelial-monocyte activating 
polypeptide II. 
14. The host cell according to claim 13, wherein 
said secreted polypeptide comprises mature IL-1β. 

15. The host cell according to claim 14, wherein 
said artificial polypeptide substrate is selected from 
pre-IL-1β* or pre-IL-1β(CSM). 

16. The host cell according to claim 12, wherein 
said first nucleotide sequence and said second nucleotide 
sequence encode a single polypeptide. 

17. The host cell according to claim 16, wherein 
said first and second nucleotide sequences encode 
NS3-4A-Δ4B-IL-1β. 

18. The host cell according to claim 12, wherein 
said first nucleotide sequence encodes a viral protease or an 
enzymatically active fragment thereof. 

19. The host cell according to claim 18, wherein 
said first nucleotide sequence encodes hepatitis C virus NS3 
protease, an NS3-4A fusion protein or amino acids 1-180 of NS3 
protease. 
20. The host cell according to claim 11 or 12, 
selected from E. coli, Bacillus, other bacteria, yeast and 
other fungi, plant cells, insect cells and mammalian cells. 

21. The host cell according to claim 20, wherein 
said host cell is a mammalian cell. 

22. The host cell according to claim 21, wherein 
said host cell is a COS cell. 

23. A recombinant DNA molecule comprising a DNA 
sequence encoding an artificial substrate selected from 
pre-IL-1β* and pre-IL-1β(CSM). 